Unsupervised training of an HMM-based speech recognizer for topic classification
Abstract
HMM-based speech-to-text (STT) systems are widely deployed not only for dictation but also as the first processing stage of many automatic speech applications, such as spoken topic classification. However, the need for transcribed data to train the HMMs precludes their use where transcribed speech is difficult to come by because of the specific domain, channel, or language. In this work, we propose building HMM-based speech recognizers without transcribed data by formulating HMM training as an optimization over both the parameter space and the transcription sequence space. We describe how this can be easily implemented using existing STT tools. We tested the effectiveness of our unsupervised training approach on topic classification on the Switchboard corpus. The unsupervised HMM recognizer, initialized with a segmental tokenizer, outperformed both an HMM phoneme recognizer trained with 1 hour of transcribed data and the Brno University of Technology (BUT) Hungarian phoneme recognizer. This approach can also be applied to other speech applications, including spoken term detection, language and speaker verification.
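One way to picture an optimization over both parameters and transcription sequences is an alternating loop: decode the most likely unit sequence under the current model, then re-estimate the model from that sequence. The toy sketch below is not the authors' system. It assumes 1-D frames, unit-variance Gaussian emissions, and fixed transition scores, and the names `viterbi`, `reestimate`, `unsupervised_train`, `stay_score`, and `switch_score` are hypothetical. The initial means stand in for whatever an initial tokenizer would provide.

```python
def viterbi(frames, means, stay_score=-0.1, switch_score=-2.0):
    """Best unit sequence under unit-variance Gaussian emissions (toy model)."""
    k = len(means)
    # Log emission score of each frame under each unit (constants dropped).
    emit = [[-0.5 * (x - m) ** 2 for m in means] for x in frames]
    score = emit[0][:]
    back = []
    for t in range(1, len(frames)):
        new_score, ptr = [], []
        for j in range(k):
            # Score of arriving in unit j from each possible previous unit i.
            cand = [score[i] + (stay_score if i == j else switch_score)
                    for i in range(k)]
            best_i = max(range(k), key=lambda i: cand[i])
            new_score.append(cand[best_i] + emit[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Trace back from the best final unit.
    state = max(range(k), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def reestimate(frames, labels, means):
    """Update each unit mean from the frames currently assigned to it."""
    new_means = []
    for j, old in enumerate(means):
        xs = [x for x, lab in zip(frames, labels) if lab == j]
        new_means.append(sum(xs) / len(xs) if xs else old)
    return new_means


def unsupervised_train(frames, init_means, iters=10):
    """Alternate decoding (labels) and re-estimation (parameters)."""
    means = list(init_means)
    labels = viterbi(frames, means)
    for _ in range(iters):
        means = reestimate(frames, labels, means)
        new_labels = viterbi(frames, means)
        if new_labels == labels:  # labels stopped changing
            break
        labels = new_labels
    return means, labels
```

Calling `unsupervised_train(frames, init_means)` on a list of floats returns the final unit means and the decoded unit label for each frame. A real recognizer would replace the toy decoder and mean update with the decoding and training passes of an existing STT toolkit.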
Similar resources
Unsupervised training of an HMM-based self-organizing unit recognizer with applications to topic classification and keyword discovery
We present our approach to unsupervised training of speech recognizers. Our approach iteratively adjusts sound units that are optimized for the acoustic domain of interest. We thus enable the use of speech recognizers for applications in speech domains where transcriptions do not exist. The resulting recognizer is a state-of-the-art recognizer on the optimized units. Specifically we propose buildi...
Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision
In our previous publication [1], we presented a new approach to HMM training, viz., training without supervision. We used an HMM trained without supervision for transcribing audio into self-organized units (SOUs) for the purpose of topic classification. In this paper we report improvements made to the system, including the use of context dependent acoustic models and lattice based features that...
Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
In the EMIME project, we are developing a mobile device that performs personalized speech-to-speech translation such that a user’s spoken input in one language is used to produce spoken output in another language, while continuing to sound like the user’s voice. We integrate two techniques, unsupervised adaptation for HMM-based TTS using a word-based large-vocabulary continuous speech recognizer...
An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation
This paper proposes an effective feature compensation scheme to address a real-life situation where a clean speech database is not available for Gaussian Mixture Model (GMM) training for a model-based feature compensation method. The proposed scheme employs a Support Vector Machine (SVM)-based model selection method to effectively generate the GMM for our feature compensation method directly from ...
Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems
This paper deals with the construction and optimization of a hybrid speech recognition system that consists of a combination of a neural vector quantizer (VQ) and discrete HMMs. In our investigations an integration of VQ based classification in the continuous classifier framework is given and some constraints are derived that must hold for the pdfs in the discrete pattern classifier context. Furth...